426 research outputs found

    New approaches to interactive multimedia content retrieval from different sources

    International Mention in the doctoral degree. Interactive Multimodal Information Retrieval (IMIR) systems extend traditional search systems with the ability to retrieve information of different types (modes) and from different sources. The growth of online content, together with the diversification of the means of access to information (phones, tablets, smart watches), drives a growing need for this type of system. This thesis defines a formal model for describing interactive multimodal information retrieval systems that query several retrieval engines. The model includes a formal, general definition of each component of an IMIR system, namely: multimodal information organized in collections, multimodal queries, different retrieval engines, a source management system (handler), a results management module (fusion) and user interactions. The model has been validated in two scenarios. The first is a use case focused on information retrieval about sports. A prototype implementing a subset of the model's features has been developed: a semantically related multimodal collection, three types of multimodal queries (text, audio and text + image), six retrieval engines (question answering, full-text search, ontology-based search, OCR in images, object detection in images and audio transcription), a source selection strategy based on expert-defined rules, a results combination strategy and the recording of user interactions. NDCG (normalized discounted cumulative gain) has been used to compare the results obtained by each retrieval engine. These results are: 10.1% (question answering), 80% (full-text search) and 26.8% (ontology search). These results are in line with state-of-the-art work in forums such as CLEF (Cross-Language Evaluation Forum). 
When the retrieval engines are combined, retrieval performance increases by a percentage gain of 771.4% over question answering, 7.2% over full-text search and 145.5% over ontology search. The second scenario is a prototype that retrieves information from social media in the health domain. This prototype is based on the proposed model and integrates user-generated health information from social media, knowledge bases, queries, retrieval engines, a source selection module, a results combination module and a GUI. The documents included in the retrieval system have been annotated beforehand by a process that extracts semantic information in the health domain. In addition, several techniques for adapting the retrieval functionality of an IMIR system have been defined by analyzing past interactions using decision trees, neural networks and clustering. After modifying the source selection strategy (handler), the system has been re-evaluated using classification techniques. The same queries and relevance judgments made by users in the sports domain prototype were used for this evaluation. The evaluation compares the NDCG obtained with two approaches: the multimodal system using predefined rules, and the same multimodal system once its functionality has been adapted from past user interactions. The NDCG changed by between -2.92% and +2.81%, depending on the approach used. Three features have been considered to classify the approaches: (i) the classification algorithm; (ii) the query features; and (iii) the scores used to compute the order of the retrieval engines. The best result is obtained using a probability-based classification algorithm, the retrieval engine ranking generated with the Averaged-Position score (the mean position of the first relevant result) and the mode, type, length and entities of the query. 
Its NDCG value is 81.54%.
    Official Doctoral Programme in Computer Science and Technology. President: Ana García Serrano. Secretary: María Belén Ruiz Mezcua. Member: Davide Buscald
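For readers unfamiliar with the NDCG measure reported in the abstract above, the following is a minimal Python sketch of one common formulation (graded relevance discounted by log2 of the rank plus one). The abstract does not state which NDCG variant or cutoff the thesis used, so the function names and the example relevance grades here are illustrative assumptions, not the author's implementation.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    # With no relevant result at all, the ideal DCG is 0; report 0.0.
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance judgments for the results of one query,
# listed in the order the engine returned them.
ranking = [0, 2, 1]
print(f"NDCG = {ndcg(ranking):.3f}")
```

A ranking that already lists results in descending relevance scores 1.0; any other ordering of the same judgments scores lower, which is what makes the measure suitable for comparing retrieval engines on the same queries.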

    A Proof-of-Concept for Orthographic Named Entity Correction in Spanish Voice Queries

    Proceedings of: 10th International Workshop on Adaptive Multimedia Retrieval. Took place October 24-25, 2012, in Copenhagen (Denmark). Automatic speech recognition (ASR) systems cannot recognize entities that are not present in their vocabulary. This paper addresses the misrecognition of named entities in Spanish voice queries and introduces a proof-of-concept for named entity correction that proposes alternatives to incorrectly recognized entities by retrieving phonetically similar entities from a dictionary. The system is domain-dependent, using sports news (especially football news), and works regardless of the automatic speech recognition system used. The correction process exploits the query structure and its semantic information to detect where a named entity appears. The system then finds the most suitable alternative entity in a dictionary previously generated from the existing named entities. This work has been partially supported by the Regional Government of Madrid under the Research Network MA2VICMR (S2009/TIC-1542) and by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade) through the BUSCAMEDIA Project (CEN-20091026).
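The idea of retrieving a phonetically similar entity from a dictionary can be illustrated with a small Python sketch. The abstract does not describe the phonetic algorithm the paper uses, so the crude Spanish-oriented key below, the function names `phonetic_key` and `suggest_entity`, and the example dictionary are all assumptions made for illustration only.

```python
import unicodedata
from difflib import SequenceMatcher


def phonetic_key(text):
    """Very rough phonetic normalization for Spanish (illustrative only)."""
    # Lowercase and strip diacritics (this also maps "ñ" to "n").
    text = unicodedata.normalize("NFD", text.lower())
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    # Collapse spellings that sound alike; order matters ("ce" before "c").
    for old, new in (("qu", "k"), ("ll", "y"), ("ce", "se"), ("ci", "si"),
                     ("c", "k"), ("v", "b"), ("z", "s"), ("h", "")):
        text = text.replace(old, new)
    return text


def suggest_entity(misrecognized, dictionary):
    """Return the dictionary entity whose phonetic key is most similar."""
    key = phonetic_key(misrecognized)
    return max(dictionary,
               key=lambda entity: SequenceMatcher(
                   None, key, phonetic_key(entity)).ratio())


# Hypothetical dictionary of football-related named entities.
entities = ["Iniesta", "Casillas", "Xavi"]
print(suggest_entity("casiyas", entities))
```

A real system would restrict the lookup to the query position where an entity is expected, as the abstract describes; this sketch only covers the dictionary lookup step.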

    An Illustrated Methodology for Evaluating ASR Systems

    Proceedings of: 9th International Workshop on Adaptive Multimedia Retrieval (AMR 2011). Took place July 18-19, 2011, in Barcelona, Spain. The event Web site is http://stel.ub.edu/amr2011/ Automatic speech recognition technology can be integrated into an information retrieval process to allow searching multimedia content. However, to ensure adequate retrieval performance, it is necessary to assess the quality of the recognition phase, especially in speaker-independent and domain-independent environments. This paper introduces a methodology for evaluating different speech recognition systems in several scenarios. It also considers the creation of new corpora of different types (broadcast news, interviews, etc.), especially in languages other than English that are not widely addressed by the speech community. This work has been partially supported by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade) through the BUSCAMEDIA Project (CEN-20091026), and by MA2VICMR: Improving the access, analysis and visibility of the multilingual and multimedia information in web for the Region of Madrid (S2009/TIC-1542).

    Proof of Concept of Ontology-based Query Expansion on Financial Domain

    This paper explains the application of an ontology in the financial domain to a query expansion process. The final goal is to improve the effectiveness of financial information retrieval (IR). The system is composed of an ontology and a Lucene index that stores and retrieves concepts identified through natural language processing. An initial evaluation with a limited number of queries has been performed. The results show that ambiguity remains a problem when expanding a query. In some cases, filtering the entities used in the expansion (for example, by sector or company, or by selecting only companies or references to markets) helps to reduce that ambiguity. This work has been partially funded by the Trendminer project (EU FP7-ICT287863), the Monnet project (EU FP7-ICT 247176) and MA2VICMR (S2009/TIC-1542).
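The effect of filtering entities during expansion can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the `ONTOLOGY` dictionary, its entity types and the `expand_query` function are hypothetical and do not come from the paper; the output merely follows the boolean `AND`/`OR` and quoted-phrase syntax of Lucene's classic query parser.

```python
# Hypothetical mini-ontology: query term -> [(entity label, entity type)].
ONTOLOGY = {
    "santander": [("Banco Santander", "company"),
                  ("Santander Cantabria", "place")],
}


def expand_query(query, ontology, allowed_types=None):
    """Expand each query term with ontology labels, optionally filtered by type."""
    clauses = []
    for term in query.lower().split():
        alternatives = [term]
        for label, entity_type in ontology.get(term, []):
            # Keeping only selected entity types reduces ambiguity.
            if allowed_types is None or entity_type in allowed_types:
                alternatives.append(f'"{label}"')
        if len(alternatives) > 1:
            clauses.append("(" + " OR ".join(alternatives) + ")")
        else:
            clauses.append(term)
    return " AND ".join(clauses)


# Unfiltered expansion pulls in the ambiguous "place" sense as well.
print(expand_query("santander shares", ONTOLOGY))
# Restricting the expansion to companies keeps only the financial sense.
print(expand_query("santander shares", ONTOLOGY, allowed_types={"company"}))
```

The second call shows the kind of filtering the abstract credits with reducing ambiguity: the expansion is limited to entities of a type that is relevant to the financial domain.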

    Some Experiments in Evaluating ASR Systems Applied to Multimedia Retrieval

    Proceedings of: 7th International Workshop on Adaptive Multimedia Retrieval (AMR 2009). Took place September 24-25, 2009, in Madrid. The event Web site is http://nlp.uned.es/amr2009/ This paper describes tests performed on different types of voice/audio input using three commercial speech recognition tools. Three multimedia retrieval scenarios are considered: a question answering system, automatic transcription of audio from video files, and a real-time captioning system used in the classroom for deaf students. A software tool, RET (Recognition Evaluation Tool), has been developed to test the output of commercial ASR systems. This research work has been supported by the Regional Government of Madrid under the Research Network MA2VICMR (S2009/TIC-1542) and by the Spanish Ministry of Education under the project BRAVO (TIN2007-67407-C03-01).

    Are Passages Enough? The MIRACLE Team Participation at QA@CLEF2009

    Proceedings of: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009. Took place September 30 - October 2, 2009, in Corfu, Greece. The event Web site is http://www.clef-campaign.org/2009.html This paper summarizes the participation of the MIRACLE team in the Multilingual Question Answering Track at CLEF 2009. In this campaign, we took part in the monolingual Spanish task at ResPubliQA and submitted two runs. We adapted our QA system to the new JRC-Acquis collection and the legal domain. We tested answer filtering and ranking techniques against a baseline system using passage retrieval, with no success. The run using question analysis and passage retrieval obtained a global accuracy of 0.33, while adding answer filtering resulted in 0.29. We provide an analysis of the results for different question types to investigate why it is difficult to leverage previous QA techniques. Another part of our work has been the application of temporal management to QA. Finally, we discuss the problems found with the new collection and the complexities of the domain. This work has been partially supported by the Research Network MAVIR (S-0505/TIC/000267) and by the project BRAVO (TIN2007-67407-C3-01).